Multi-Criteria-Based Strategy to Stop Active Learning for Data Annotation

Authors

  • Jingbo Zhu
  • Huizhen Wang
  • Eduard H. Hovy
Abstract

In this paper, we address the issue of deciding when to stop active learning for building a labeled training corpus. First, this paper presents a new stopping criterion, classification-change, which considers the potential ability of each unlabeled example to change the decision boundaries. Second, a multi-criteria-based combination strategy is proposed to solve the problem of predefining an appropriate threshold for each confidence-based stopping criterion, such as max-confidence, min-error, and overall-uncertainty. Finally, we examine the effectiveness of these stopping criteria on uncertainty sampling and heterogeneous uncertainty sampling for active learning. Experimental results show that these stopping criteria work well on evaluation data sets, and the combination strategies outperform individual criteria.
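The abstract names the four criteria but does not define them, so the sketch below is only one plausible reading, not the authors' implementation. It assumes that max-confidence and overall-uncertainty are measured with the entropy of the classifier's predicted class distribution over the unlabeled pool, that min-error compares the classifier's predictions on the newly annotated batch with the annotator's labels, and that classification-change checks whether any pool example changed its predicted label between two consecutive iterations. All function names, thresholds, and the simple "both conditions must hold" combination rule are hypothetical choices made for illustration.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def max_confidence_met(pool_probs: Sequence[Sequence[float]], threshold: float) -> bool:
    """Assumed reading: stop when even the most uncertain pool example
    has entropy below the threshold."""
    return max(entropy(p) for p in pool_probs) < threshold


def overall_uncertainty_met(pool_probs: Sequence[Sequence[float]], threshold: float) -> bool:
    """Assumed reading: stop when the mean entropy over the pool is below the threshold."""
    return sum(entropy(p) for p in pool_probs) / len(pool_probs) < threshold


def min_error_met(predicted: Sequence[int], oracle: Sequence[int], threshold: float) -> bool:
    """Assumed reading: stop when accuracy on the newly annotated batch
    exceeds the threshold."""
    correct = sum(1 for a, b in zip(predicted, oracle) if a == b)
    return correct / len(oracle) > threshold


def classification_change_met(prev_labels: Sequence[int], curr_labels: Sequence[int]) -> bool:
    """Assumed reading: stop when no pool example changed its predicted label
    between two consecutive iterations."""
    return all(a == b for a, b in zip(prev_labels, curr_labels))


def combined_stop(confidence_met: bool, change_met: bool) -> bool:
    """Hypothetical combination rule: require a confidence-based criterion
    and classification-change to agree before stopping."""
    return confidence_met and change_met
```

In an active learning loop, these checks would run once per iteration, after retraining and before selecting the next batch. Requiring classification-change to agree with a confidence-based criterion is one way to make the outcome less sensitive to a poorly chosen threshold, which is the problem the abstract says the combination strategy targets.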

Similar resources

The Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning

In our modern technological world, Computer-Assisted Language Learning (CALL) is a new realm for learning a language in general, and learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotatio...

Multi-criteria-based Active Learning for Named Entity Recognition

In this thesis, we propose a multi-criteria-based active learning approach and effectively apply it to the task of named entity recognition. Active learning targets to minimize the human annotation efforts to learn a model with the same performance level as supervised learning by selecting the most useful examples for labeling. To maximize the contribution of the selected examples, we consider ...

Multi-Criteria-based Active Learning for Named Entity Recognition

In this paper, we propose a multi-criteria based active learning approach and effectively apply it to named entity recognition. Active learning targets to minimize the human annotation efforts by selecting examples for labeling. To maximize the contribution of the selected examples, we consider the multiple criteria: informativeness, representativeness and diversity and propose measures to quan...

Using Variance as a Stopping Criterion for Active Learning of Frame Assignment

Active learning is a promising method to reduce human’s effort for data annotation in different NLP applications. Since it is an iterative task, it should be stopped at some point which is optimum or near-optimum. In this paper we propose a novel stopping criterion for active learning of frame assignment based on the variability of the classifier’s confidence score on the unlabeled data. The im...

Combining Active Learning and Partial Annotation for Japanese Dependency Parsing

The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows how active learning can be used for domain adaptation of d...

Publication date: 2008